Predicting multibody assembly of proteins
This thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and designing new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking), whose multi-term scoring function combines a statistical thermodynamic approximation of molecular free energy with several knowledge-based terms. Parameters of the scoring model were learned from a large set of positive/negative examples, and when tested on 176 protein complexes of various types, the model showed excellent accuracy in ranking correct configurations higher (F²Dock ranks the correct solution as the top-ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals of distance-dependent decaying kernels over the occupied volume, the boundary, or a set of discrete points (atom locations). We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms, assuming that the smallest feature size h is Θ(r_max), where r_max is the radius of the largest atom; it supports updates in O(log m) time and uses O(m) memory. We also developed the dynamic packing grids (DPG) data structure, which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word size of the RAM. Together, DPG and DAG yield an O(k)-time approximation of the scoring terms, where k << m is the size of the contact region between proteins.
[...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one, since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles) and their global configurations (tilings). We further show that the tilings, and the mapping of proteins to tilings on arbitrarily sized shells, are parameterized by 3 discrete parameters and 6 continuous degrees of freedom (DOF), and that the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of proteins n). We also consider the case where a coarse model of the whole complex of proteins is available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restrict the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict an accurate high-resolution model of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design.
Computer Science
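The idea of restricting kernel-based scoring to the contact region can be illustrated with a much simpler structure than the ones described in the abstract. The sketch below is not the DPG or DAG: it is a plain uniform hash grid, and the names `UniformGrid`, `contact_score`, `cutoff` and `sigma`, as well as the Gaussian kernel, are assumptions made for illustration only.

```python
import math
from collections import defaultdict


class UniformGrid:
    """Hash grid over 3D points (a simplified stand-in, not the DPG itself)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # integer cell index -> points in that cell

    def _key(self, point):
        return tuple(int(math.floor(c / self.cell_size)) for c in point)

    def insert(self, point):
        self.cells[self._key(point)].append(point)

    def query(self, center, radius):
        """Return all stored points within `radius` of `center`."""
        reach = int(math.ceil(radius / self.cell_size))
        cx, cy, cz = self._key(center)
        found = []
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                for dz in range(-reach, reach + 1):
                    # .get avoids creating empty cells during a query
                    for p in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                        if math.dist(p, center) <= radius:
                            found.append(p)
        return found


def contact_score(receptor_grid, ligand_atoms, cutoff, sigma):
    """Sum a Gaussian distance kernel over receptor-ligand pairs within `cutoff`.

    Only pairs returned by the neighborhood query are visited, so the work
    scales with the size of the contact region, not with all atom pairs.
    """
    total = 0.0
    for a in ligand_atoms:
        for b in receptor_grid.query(a, cutoff):
            d2 = sum((x - y) ** 2 for x, y in zip(a, b))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total
```

With a cell size close to the cutoff, each query inspects only a handful of cells, which is the property the grid structures in the thesis are designed to provide with stronger guarantees.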
Protein-Protein Docking with F2Dock 2.0 and GB-Rerank
Rezaul Chowdhury is with UT Austin; Muhibur Rasheed is with UT Austin; Maysam Moussalem is with UT Austin; Donald Keidel is with The Scripps Research Institute; Arthur Olson is with The Scripps Research Institute; Michel Sanner is with The Scripps Research Institute; Chandrajit Bajaj is with UT Austin.
Motivation -- Computational simulation of protein-protein docking can expedite the process of molecular modeling and drug discovery. This paper reports on our new F2 Dock protocol, which improves the state of the art in initial-stage rigid-body exhaustive docking search, scoring and ranking by introducing improvements in the shape-complementarity and electrostatics affinity functions, a new knowledge-based interface propensity term with an FFT formulation, a set of novel knowledge-based filters, and finally a solvation energy (GBSA) based reranking technique. Our algorithms are based on highly efficient data structures, including the dynamic packing grids and octrees, which significantly speed up the computations and also provide guaranteed bounds on approximation error. Results -- The improved affinity functions show superior performance compared to their traditional counterparts in finding correct docking poses at higher ranks. We found that the new filters and the GBSA-based reranking, individually and in combination, significantly improve the accuracy of docking predictions with only a minor increase in computation time. We compared F2 Dock 2.0 with ZDock 3.0.2 and found improvements over it: among the 176 complexes in ZLab Benchmark 4.0, F2 Dock 2.0 finds a near-native solution as the top prediction for 22 complexes, whereas ZDock 3.0.2 does so for 13 complexes. F2 Dock 2.0 finds a near-native solution within the top 1000 predictions for 106 complexes, as opposed to 104 complexes for ZDock 3.0.2.
However, there are 17 complexes for which F2 Dock 2.0 finds a solution but ZDock 3.0.2 does not, and 15 complexes for which the reverse holds, which indicates that the two docking protocols can also complement each other. Availability -- The docking protocol has been implemented as a server with a graphical client (TexMol), which allows the user to manage multiple docking jobs and visualize the docked poses and interfaces. Both the server and client are available for download. Server: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dock.shtml. Client: http://www.cs.utexas.edu/~bajaj/cvc/software/f2dockclient.shtml.
The research of C.B., R.C., M.M., and M.R. of the University of Texas was supported in part by National Science Foundation (NSF) grant CNS-0540033 and by National Institutes of Health (NIH) grants R01-GM074258, R01-GM073087, and R01-EB004873. The research of M.M. was additionally supported by an NSF Graduate Research Fellowship. The research of M.S. and A.O. of TSRI was supported in part by a subcontract on NIH grant R01-GM073087. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Computer Science
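The abstract above relies on FFT formulations to score many rigid-body placements at once. A minimal sketch of the underlying trick follows, assuming two occupancy grids of equal shape and cyclic (wrap-around) translations; it scores a plain overlap count, not the F2 Dock affinity functions, and `fft_translation_scan` is a hypothetical name.

```python
import numpy as np


def fft_translation_scan(receptor, ligand):
    """Score every cyclic translation t of `ligand` against `receptor`.

    scores[t] = sum over x of receptor[x] * ligand[x - t], with indices
    wrapping around the grid. By the correlation theorem this equals the
    inverse FFT of FFT(receptor) * conj(FFT(ligand)) for real inputs.
    """
    rec_hat = np.fft.fftn(receptor)
    lig_hat = np.fft.fftn(ligand)
    return np.real(np.fft.ifftn(rec_hat * np.conj(lig_hat)))


# Toy grids: a 2x2x2 block in each, offset from one another by (2, 2, 2).
receptor = np.zeros((8, 8, 8))
receptor[2:4, 2:4, 2:4] = 1.0
ligand = np.zeros((8, 8, 8))
ligand[0:2, 0:2, 0:2] = 1.0

scores = fft_translation_scan(receptor, ligand)
best = np.unravel_index(np.argmax(scores), scores.shape)
print(tuple(int(i) for i in best))  # translation that maximizes overlap
```

One FFT-based scan replaces an explicit loop over all N translations of an N-voxel grid, reducing the cost from O(N²) to O(N log N) per orientation. In a full docking search such a scan would be repeated for each sampled rotation of the ligand, and padding would be needed to avoid wrap-around contacts.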
X-ray, Cryo-EM, and computationally predicted protein structures used in integrative modeling of HIV Env glycoprotein gp120 in complex with CD4 and 17b
We present the data used for an integrative approach to computational modeling of proteins with large variable domains, specifically applied in this context to model HIV Env glycoprotein gp120 in its CD4 and 17b bound state. The initial data involved X-ray structure PDBID:1GC1 and electron microscopy image EMD:5020. Other existing X-ray structures were used as controls to validate and hierarchically refine partial and complete computational models. A summary of the experimental protocol and data was published (Rasheed et al., 2015) [26], along with detailed analysis of the final model (PDBID:3J70) and its implications.
PF2fit: Polar Fast Fourier Matched Alignment of Atomistic Structures with 3D Electron Microscopy Maps.
It is increasingly common for both an atomistic structure model in the PDB (possibly reconstructed from X-ray diffraction or NMR data) and a 3D reconstructed cryo-electron microscopy (3D EM) map in the EMDB (albeit at coarser resolution) to be available for the same or a homologous molecule or molecular assembly. To obtain the best possible structural model of the molecule at the best achievable resolution, and without gaps, one typically aligns (matches and fits) the atomistic structure model with the 3D EM map. We discuss a new algorithm and generalized framework, named PF2fit (Polar Fast Fourier Fitting), for the best possible structural alignment of atomistic structures with 3D EM maps. While PF2fit performs only rigid, six-dimensional (6D) alignment, it augments prior work on 6D X-ray structure and 3D EM alignment in multiple ways: Scoring. PF2fit includes a new scoring scheme that, in addition to rewarding overlaps between the volumes occupied by the atomistic structure and the 3D EM map, rewards overlaps between the volumes complementary to them. We quantitatively demonstrate how this new complementary scoring scheme improves upon existing approaches. PF2fit also includes two scoring functions, the non-uniform exterior penalty and the skeleton-secondary structure score, and implements the scattering potential score as an alternative to traditional Gaussian blurring. Search. PF2fit utilizes a fast polar Fourier search scheme, whose main advantage is the ability to search over uniformly and adaptively sampled subsets of the space of rigid-body motions. PF2fit also implements a new reranking search and scoring methodology that considerably improves alignment metrics in results obtained from the initial search.
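The complementary scoring idea described above can be sketched on binary voxel masks. This is only an illustration under stated assumptions: the function name `complementary_overlap_score` and the `weight` parameter are hypothetical, the masks are assumed to be already thresholded and sampled on the same grid, and the actual PF2fit score is not reproduced here.

```python
import numpy as np


def complementary_overlap_score(model_mask, map_mask, weight=1.0):
    """Reward agreement inside and outside the molecule.

    model_mask: boolean grid, True where the atomistic model occupies space.
    map_mask:   boolean grid, True where the EM map is above threshold.
    weight:     hypothetical relative weight of the complementary term.
    """
    inside = np.count_nonzero(model_mask & map_mask)      # overlap of occupied volumes
    outside = np.count_nonzero(~model_mask & ~map_mask)   # overlap of complements
    return inside + weight * outside


# One-dimensional toy example: the masks agree on one occupied voxel
# and one empty voxel, and disagree on the other two.
model = np.array([True, True, False, False])
em_map = np.array([True, False, False, True])
print(complementary_overlap_score(model, em_map))
```

Rewarding the complementary overlap penalizes placements where the model spills into regions the map marks as empty, which a score based on occupied-volume overlap alone would not distinguish.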
Average rank of best RMSD result returned by PF2fit-SE(3) after reranking.
GCCS was used in the initial stage. The figures in brackets denote the rank in the presence of noise at SNR = 1. We see a strong decrease in rank for the skeleton-secondary structure score with and without noise, while the mutual information score remains predictable across the range of resolutions. See the section on "Datasets" for a list of PDBs used to generate the synthetic maps used in this experiment. Note that even if the ranks of the best RMSD solution, on average across all experiments, show no improvement over Table 1 (mostly because GCCS already does an excellent job of ranking them), the ranks actually improved for several of the experiments (73/318 for MIS, and 5/318 for MIS). Please see Section "The performance of reranking increases with resolution" for details.
Schematic of representations used in our algorithms.
(A) PDB schematic, showing the target volume $V$ and the complementary volume $\bar{V}$. (B) 3D EM map schematic, showing the target volume $V$ and the complementary volume $\bar{V}$. Detailed definitions can be found in the Materials and Methods section.
Comparison of PF2fit with other software in subunit-assembly fitting.
Fitting the PDB molecule